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We discuss the implementation of two distributed solvers of the random K-SAT problem, based on 
£SJ ■ some development of the recently introduced Survey Propagation (SP) algorithm. The first solver, 

called the "SP diffusion algorithm" diffuses as dynamical information the maximum bias over the 
system, so that variables-nodes can decide to freeze in a self-organized way, each variable taking its 
decision on the basis of purely local information. The second solver, called the "SP reinforcement 
algorithm", makes use of time-dependent external forcings on each variable, which are adapted in 
| time in such a way that the algorithm approaches its estimated closest solution. Both methods 

, allow to find a solution of the random 3-SAT problem in a range of parameters comparable with the 

best previously described serialized solvers. The simulated time of convergence towards a solution 
( if these solvers were implemented on a fully parallel device) grows as log(N). 
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I. INTRODUCTION 



\ Recent progress in the statistical physics of disordered systems has provided results of algorithmic interest: the 
£^ ■ connection between the geometrical structure of the space of solutions in random NP-complete constraint satisfaction 
problems (CSP) and the onset of exponential regimes in search algorithms [3, 16] has been at least partially clarified. 
More important, the analytical methods have been converted into a new class of algorithms [17] with very good 
performance [11]. 

t-H , Similarly to statistical physics models, a generic CSP is composed of many discrete variables which interact through 
' constraints, where each constraint involves a small number of variables. When CSPs are extracted at random from 
non trivial ensembles there appear phase transitions as the constraint density (the ratio of constraints to variables) 
3c ■ increases: for small densities the problems are satisfiable with high probability whereas when the density is larger 
than a critical value the problems become unsatisfiable with high probability. Close to the phase boundary on the 
satisfiable side, most of the algorithms are known to take (typically) exponential time to find solutions. The physical 
' interpretation for the onset of such exponential regimes in random hard CSP consists in a trapping of local search 
processes in metastable states. Depending on the models and on the details of the process the long time behaviour may 
be dominated by different types of metastable states. However a common feature which can be observed numerically 
is that for simulation times which are sub-exponential in the size of the problem there exists an extensive gap in the 
number of violated constraints which separates the blocking configurations from the optimal ones. Such behavior 
can be tested on concrete random instances which therefore constitute a computational benchmark for more general 
algorithms. 

In the last few years there has been a great progress in the study of phase transitions in random CSP which 
^ . has produced new algorithmic tools: problems which were considered to be algorithmically hard for local search 
algorithms, like for instance random K-SAT close to a phase boundary, turned out to be efficiently solved by the so 
called Survey Propagation (SP) algorithm [17] arising from the replica symmetry broken (RSB) cavity approach to 
CSP. According to the statistical physics analysis, close to the phase transition point the solution space breaks up into 
many smaller clusters [16, 17]. Solutions in separate clusters are generally far apart. This picture has been confirmed 
by rigorous results, first on the simple case of the XORSAT problem [2, 5], and more recently for the satisfiability 
problem [4, 6]. Moreover, the physics analysis indicates that clusters which correspond to partial solutions — which 
satisfy some but not all of the constraints — are exponentially more numerous than the clusters of complete solutions 
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and act as dynamical traps for local search algorithms. SP turns out to be able to deal efficiently with the proliferation 
of such clusters of metastable states. 

The SP algorithm consists in a message-passing technique which is tightly related to the better known Belief 
Propagation algorithm (BP) [21] — recently applied with striking success in the decoding of error-correcting codes 
based on sparse graphs encodings [7, 8, 12-15] — but which differs from it in some crucial point. The messages 
sent along the edges of the graph underlying the combinatorial problem describe in a probabilistic way the cluster- 
to-cluster fluctuations of the optimal assignment for a variable; while BP performs a "white" average over all the 
solutions, SP tells us which is the probability of picking up a cluster at random and finding a given variable forced 
to take a specific value within it ("frozen" variable). Once the iterative equations have reached a fixed-point of such 
probability distributions (called "surveys" because they capture somehow the distribution of the expectation in the 
different clusters), it becomes possible in general to identify the variables which can be safely fixed and to simplify 
the problem accordingly [11, 17]. This procedure, which is intrinsically serial, is known as the SP-inspired decimation 
algorithm (SID). 

From the experimental point of view, the SID has been efficiently used to solve many instances in the hard region 
of different satisfiability problems and of the graph coloring problem, including instances too large for any earlier 
method [17]. For example, for random 3-SAT, instances close to the threshold, up to sizes of order 10 7 variables were 
solved and the computational time in this regime was found experimentally to scale roughly as TV (log N) 2 . 

In the present manuscript, we show how the SP decimation algorithm can be made fully distributed. This opens 
the possibility of a parallel implementation which would lead to a further drastic reduction in the computational cost. 

The paper is organized as follows: in part II we introduce some notation and the K-SAT problem, in part III we 
recall the SP algorithm along with the serial decimation procedure, and parts IV and V are devoted to explaining our 
new parallel solvers, respectively the first one based on information diffusion and the second one based on external 
forcing on the variables. 

II. THE SAT PROBLEM AND ITS FACTOR GRAPH REPRESENTATION 

Though the results we shall discuss are expected to hold for many types of CSP, here we consider just one repre- 
sentative case : the K-SAT problem. 

A K-SAT formula consists of N boolean variables Xi £ {0, 1} , i G {1, . . . , TV}, with M constraints, in which each 
constraint is a clause, which is the logical OR (V) of the variables it contains or of their negations. 

Solving the K-SAT formula means finding an assignment of the Xi G {0, 1} = {directed, negated} which is such 
that all the M clauses are true. In the literature, such an assignment can indifferently be called solution, satisfying 
assignment or ground state if one uses the statistical physics jargon. 

In the case of randomly generated K-SAT formulas (variables appearing in the clauses chosen uniformly at random 
without repetitions and negated with probabiliy 1/2) a sophisticated phase transition phenomenon sets in : when 
the number of constraints M = aN is small, the solutions of the K-SAT formulas are distributed close one to each 
other over the whole TV— dimensional space, and the problem can easily be solved by the use of classical local search 
algorithms. Conversely, when a is included in a narrow region ad < a < a c , the problem is still satisfiablc but the now 
limited solution phase breaks down into an exponential number of clustered components. Solutions become grouped 
together into clusters which are fart apart one from the other. This range of parameters is known as the hard-SAT 
phase: classical local search algorithms get trapped by strictly positive energy configurations which are exponentially 
more numerous than the ground state ones. 

A clause a is characterized by the set of variables ii, %k which it contains, and the list of those which are negated, 
which can be characterized by a set of K numbers J" G {±1} as follows. The clause is written as 

( Zll V...Vz, r V...Vz, K ) (1) 

where Zi r = Xi r if Jf = — 1 and Zi r = Xi r if Jf = 1 (note that a positive literal is represented by a negative J). The 
problem is to find whether there exists an assignment of the Xi G {0, 1} which is such that all the M clauses are true. 
We define the total cost C of a configuration x = (x%, x/v) as the number of violated clauses. 

In the so called "factor graph" representation, the SAT problem can be schemed as follows (see Fig. 1). Each of the 
N variables is associated to a vertex in the graph, called a "variable node" (circles in the graphical representation), 
and each of the M clauses is associated to another type of vertex in the graph, called a "function node" (squares in 
the graphical representation). A function node a is connected to a variable node i by an edge whenever the variable Xi 
(or its negation) appears in the clause a. In the graphical representation, we use a full line between a and i whenever 
the variable appearing in the clause is Xi (i.e. Jf = —1), a dashed line whenever the variable appearing in the clause 
is Xi (i.e. Jf = 1). 
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Figure 1: A function node a and its neighborhood. The survey of the cavity-bias u a ->i can be computed from the knowledge 
of the joint probability distribution for all the cavity-biases in the set U, i.e. those coming onto all variable nodes j neighbors 
of a (except j = i). 

Throughout this paper, the variable nodes indices are taken in i, j, fc, while the function nodes indices are taken 
in a, 6, c, .... For every variable node i, we denote by V(i) the set of function nodes a to which it is connected by an 
edge, by n; = \V(i)\ the degree of the node, by V+(i) the subset of V(i) consisting of function nodes a where the 
variable appears un-negated -the edge (a, i) is a full line , and by V_(i) the complementary subset of V(i) consisting 
of function nodes a where the variable appears negated -the edge (a, i) is a dashed line-. V(i) \ b denotes the set 
V(i) without a node b. Similarly, for each function node a, we denote by V(a) = V+ (a) U V- (a) the set of neighboring 
variable nodes, decomposed according to the type of edge connecting a and i, and by n a its degree. Given a function 
node a and a variable node j, connected by an edge, it is also convenient to define the two sets: V*(j) and V a "(j), 
where the indices s and u respectively refer to the neighbors which tend to make variable j respectively satisfy or 
unsatisfy the clause a, defined as (see Fig. f): 

if J? = 1 : V:(j) = V+(j) ; V°(j) = V-(j) \ a (2) 
if ^ = -1: V:(j)=V-(j) ; V* (j) = V+(j) \ a (3) 



III. SURVEY PROPAGATION 

The survey propagation algorithm (SP) is a message passing algorithm in which the messages passed between 
function nodes and variable nodes have a probabilistic interpretation. Every message is the probability distribution 
of some warnings. In this section we provide a short self-contained introduction to SP, starting with the definition of 
warnings. We refer to refs. [11, 16, 17] for the original derivation of SP from statistical physics tools. 

A. Warnings 

The basic elementary message passed from one function node a to a variable i (connected by an edge) is a Boolean 

number u a >^ £ {0, 1} called a 'warning'. 

Given a function node a and one of its variable nodes i, the warning u a ^t is determined from the warnings 
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arriving on all the variables j G V(a) \ i according to: 

= n d \- j n E h ( 4 ) 

3*eV(o)\i \ \beV(j)\a / / 

where 9(x) = if x < and 0(x) = 1 if a; > 0. 

A warning u a ^i = 1 can be interpreted as a message sent from function node a, telling the variable i that it should 
adopt the correct value in order to satisfy clause a. This is decided by a according to the warnings which it received 

from all the other variables j to which it is connected: if (^2b^v(j)\a ^j Ub ^j^j Jj < 0' this means that the tendency 

for site j (in the absence of a) would be to take a value which does not satisfy clause a. If all neighbors j G V(a) \ i 
are in this situation, then a sends a warning to i. 

The simplest message passing algorithm one can think of is just the propagation of warnings, in which the rule (4) 
is used as an update rule which is implemented sequentially. The interest in WP largely comes from the fact that it 
gives the exact solution for tree-problems. In more complicated problems where the factor graph is not a tree, it turns 
out that WP is unable to converge whenever the density of constraints is so large that the solution space is clustered. 
This has prompted the development of the more elaborate SP message passing procedure. 



B. The algorithm 

A message of SP, called a survey, passed from one function node a to a variable i (connected by an edge) is a 
real number ?7 a _>i G [0,1]. The interpretation of the survey rj a ^i is a probability among all clusters of satisfying 
assignments that a warning is sent from a to i. 

The SP algorithm uses a random sequential update (see Alg. 1), which calls the local update rule SP-UPDATE 
(Alg. 2): 



Algorithm 1 : SP 

Input : The factor graph of a Boolean formula in conjunctive normal form; a maximal number of iterations 
tmax ; a requested precision e 

Output : UN-CONVERGED if SP has not converged after t max sweeps. If it has converged: the set of all 
messages r/*_i 

1: At time t = 0: 

2: for every edge a — > i of the factor graph do 

3: randomly initialize the messages r}a^i{t = 0) G [0,1] 

4: end for 

5: for t — 1 tO t = t max dO 

6: sweep the set of edges in a random order, and update sequentially the warnings on all the edges of the 
graph, generating the values T) a —fi(t) , using subroutine BP-UPDATE 

7: if |?7 a _>j(t) — r] a -fi(t — 1)| < e on all the edges then 

8: the iteration has converged and generated = Tla—*i(t) 

9: goto line 12 

10: end if 
11: end for 
12: if t = tm ax then 
13: return UN-CONVERGED 

14: else {t < tmax} 

15: return the set of fixed point warnings ri*-,i = f] a — n(t) 
16: end if 



Whenever the SP algorithm converges to a fixed-point set of messages T]*^, one can use it in a decimation procedure 
in order to find a satisfiable assignment, if such an assignment exists. This procedure, called the survey inspired 
decimation (SID) is described in Alg. 3. 

There exist several variants of this algorithm. In the code which is available at [24], for performance reasons 
we update simultaneously all rj belonging to the same clause. The clauses to be updated are chosen in a random 
permutation order at each iteration step. The algorithm can also be randomized by fixing, instead of the most biased 
variables, one variable randomly chosen in the set of the x percent variables with the largest bias. This strategy 
allows to use some restart in the case where the algorithm has not found a solution. A fastest decimation can also be 
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Algorithm 2 : subroutine SP-UPDATE(r/ a ^) 



Input : Set of all messages arriving onto each variable node j £ V(a)\i 
Output : New value for the message r/ a ^i 



1: for every j € V(a) \i do 

2: compute the three numbers: 



n 11 = 



i- n t 1 -^ 



n a-^) 



l- n (i-^o 

n?^a = n (!-^) 

6evO')\o 

if a set like V a s (j) is empty, the corresponding product takes value 1 by definition. 
3: Using these numbers , compute and return the new survey: 



j£V(a)\i 



II" 



n" + it + n° 



If V(a)\i is empty, then r; a ^i = 1. 
4: end for 



(5) 



(6) 



obtained by fixing in the step 12, instead of one variable, a fraction / of the N t variables (the most polarized ones) 
which have not yet been fixed (going back to 1 variable when fN t < 1). 

In Rcf. [11], experimental results are reported which show that SID is able to solve efficiently huge SAT instances 
very close to the threshold. More specifically, the data discussed in Ref. [11] show that if SID fixes at each step only 
one variable, it converges in <d(N 2 \ogN) operations (the time taken by walksat [18] to solve the simplified sub-formula 
seems to grow more slowly). When a fraction of variables is fixed at each time step, we get a further reduction of the 
cost to 0(N(logN) 2 ) (the second log comes from sorting the biases). 

In the following sections we show how the SP procedures can be made fully distributed and hence amenable to a 
parallel implementation. 



IV. DISTRIBUTED SP I: DIFFUSION ALGORITHM 



A. Simulating SP in parallel 

The SP algoriths described in the preceeding section is by constructions distributed algorithms, since updates of 
nodes are performed using message passing procedures from nearest neighbors. However when one uses the surveys 
in order to find the SAT assignments, in the SID procedure, the decimation process breaks this local information 
exchange design: it requires a global information, namely the maximally polarization field, used to decide which node 
has to be frozen at first. It is the purpose of this section to define a procedure able to diffuse such a global information, 
by using the message passing procedure. 

In the present distributed implementation of SP (see Alg. 4), each node is responsible for its own information, 
any sort of centralized information is prohibited. The information stored at a given function node a is the set of 
surveys {i~) a ^i,i £ ^( a )}- I n turn, a given variable node i, keeps an information which is the set of cavity-fields 
{IL.^ a ,a 6 V(i)}. The scheme amounts to implement a message passing procedure such that function nodes send 
their {rj} values to ncigbouring variables nodes, and variable nodes send their fields values in return. Each time a node 
receives a message from its neighbours, it has to update its own information, accordingly to the survey propagation 
equations. The broadcasting of information for each individual node is supposed to occur at random. Our simulation 
proceeds by random updating of each variable or function node. The update time-scale, that is the rate at which nodes 
in a distributed implementation are able to collect new information and perform their own update will be denoted 
by r. Typically, as we shall see the full decimation process is of the order of 10 4 r, which means that in average each 
node has to update 10 4 time before the process is completed. 
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Algorithm 3 : SID 

Input : The factor graph of a Boolean formula in conjunctive normal form. A maximal number of iterations 
t max and a precision e used in SP 

Output : One assignment which satisfies all clauses, or 'SP UNCONVERGED ' , or 'probably UNSAT' 



1: Random initial condition for the surveys 

2: run SP 

3: if SP does not converge then 

4: return 'SP UNCONVERGED ' and stop (or restart, i.e. goto line 1) . 

5: else {if SP converges} 

6: keep the fixed-point surveys r;*_ ti ; goto line 9 

7: end if 

8: Decimate: 

9: if non-trivial surveys ({77 7^ 0}) are found then 
for each variable node i do 



10: 
11: 



evaluate the three 'biases' {W q 



(+) 



(-) 



W$ 0) } defined by: 

n+ 



(-) 



n+ + nr + n« 

nr 



n+ + n- + n° 



w (o) = i_w 



(+) 



where 11+,^ , f[° are defined by 



n 



n- 



n 



1- n (i-^^i) 

aSV + (i) 

1- n (i-^-o 

o£V_(i) 

n (1-^) 

a€V(i) 



(") 



n a 

a£V-(i) 

n a 

aGV + (i) 



(7) 

(8) 
(9) 



(10) 



12: 



fix the variable with the largest \Wj; +) - W} '\ to the value x, = 1 if VK^" 1 > ; , to the value 

i, = if < 

Clean the graph, which means: toremove the clauses satisfied by this fixing, to reduce the clauses 
that involve the fixed variable with opposite literal, and to update the number of unfixed 
variables . 
end for 

else {if all surveys are trivial ({77 = 0}) } 

output the simplified sub-formula and run on it a local search process (e.g. walks at ) . 
end if 

if the problem is solved completely by unit clause propagation then 

return ' 'SAT" and stop 
else if no contradiction is found then 

continue the decimation process on the smaller problem (goto line 2) 
else {if a contradiction is reached} 

return 'probably UNSAT' 

stop 
end if 



(+) 



A-) 



13: 



14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 



B. Propagation of information 

Messages can contain additional information to the one needed to run SP. In the present case, the decimation 
procedure diffuses the information concerning the highest polarization fields H max = Max(|iy + — W~ |) in the system, 
in order that the variable node, being aware to be the most polarized one, can decide to freeze its assignment to the 
one given by the orientation of the polarization field. 
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Algorithm 4 : diffusive decimation algorithm 
Input : The factor graph of a Boolean formula in conjunctive normal form; a maximal computational time 
tmax , a requested precision e, a number n su of succesive update with maximum polarization required to freeze 
a variable, the damping factor S for information decay, a random set of cavity biases T] a —>i . 
Output : UN-CONVERGED if the algorithm has not converged after t max • If it has converged: one 
assignement which satisfy a fraction of the clauses, and a remaining reduced factor graph corresponding to 
a paramgnetic state which can be solved calling ualksat 

1: At time t = 0: 

2: Initialize the counter n su (i) = 0, i = 1 . . . N 

3: parallel update of each variable node using VARIABLE-NODE-UPDATE. 
4: for t < t max AND H m ax > e do 

5: parallel asynchronized update of each variable node using VARIABLE-NODE-UPDATE and each function node 

using FUNCTION-NODE-UPDATE, at a mean rate of time r per node. 
6: end for 
7: if H max < e then 

8: return the set of assignments of the frozen variables and the remaining reduced factor graph of 

unfrozen variables . 
9: else 

10: return UN-CONVERGED 
11: end if 



Algorithm 5 : subroutine VARIABLE-NODE-UPDATE 

Input : Set of all cavity bias surveys {r] a ^i, a G V(i)} arriving at a given variable node i, information 
H max (a) on the maximum polarization comming from each neighbor a £ V(i) 

Output : New value for the set of probabilites {II* +a , LT°_^ a , a € V(i)} , new information H max (i) on the 
maximum polarization as estimated by i 

1: for every a £ V(i) do 

2: compute the values of n j _ >OJ Il| l _ >a using eq. (5). 

3: end for 

4: evaluate the two 'biases' {W} + \ W^} using eq. (7,8): 

5: if H max (a) < \W^ +) - W^l+e then 

6: H max {i) = \wl +) - Wt ] \ 
7: n su (i) = n su (i) + 1 
8: if n su (i) > n su then 

9: freeze variable i to the value x t = 1 if Wj; +) > W}~ ) , to the value x t = if W^ +) < 

10: end if 

11: else 

12: H max {i) = Ma.x(H max (i), H max (a)) * (1 - <5) 
13: n BU {i) = 

14: end if 



A static information can freely propagate by simply diffusing in the network, and any component of the network 
should be informed by a time which has to be proportional to log(A r ), N being the size (number of variables nodes) 
of the system. The difficulty is that the information to be broadcasted may still vary with time before the system 
has reached equilibrium. The difficulty in propagating a signal saying that the system has converged, might be even 
more severe. Indeed this amounts to propagate the maximum displacement from the last update of each individual 
node, which is by essence a transient information. 

To take advantage of the basic message-passing procedure, in the case in which the information to be diffused can 
vary with time, we have to add a mechanism able to suppress obsolete information. The way we propose to do this 
is to incorporate some damping factor in the broadcasted values. 

Let us call 8 <C 1 this damping factor. The procedure goes as follows (see Alg. 5) Each node (a for a function 
node or i for a variable node) stores its own estimate of the max polarization field H rnax in a local variable H max (i) 
or H max (a). Consider first a variable node i. When this updates, in addition to the set {rjb—fi} from its neighbour 
function nodes 6, it collects also the information H max (b) about the maximum, coming from these nodes, compare it 
to its own estimation H max (i), and also to its own field value Wi = \W^~ — W~\. In case the new H max {i) estimation 
indicates that i is not the most polarized variable node (H max (i) > Wi), H max (i) is multiplied by a factor (1 — S). If 
instead, it appears that i might be the most polarized node {H max (i) = Wi), then H max (i) is kept as it stands, with 
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Algorithm 6 : subroutine FUNCTION-NODE-UPDATE 

Input : Set of all probabilites {n*_, a , Tl®_ a , i £ V(a)} arriving at a given function node a, information 
H ma x(i) on the maximum polarization comming from each neighbor i 6 V(a) 

Output : New value for the set of cavity bias survey {r/ a ^i,i £ V(a)} , new information H max (a) on the 
maximum polarization as estimated by a 

1: for every i £ V(a) do 

2: Compute r/ a ^i using eq. (6) . 

3: Hmax(a) = Kax(Hmax{i), Hmax(a)) * (1 — S) 

4: end for 



no damping correction. For a function node, when it updates, we apply this damping factor anyway. The virtue of 
this damping effect, is to eliminate any false information. Suppose indeed, that at a given time, a strongly polarized 
variable node diffuses in the network a very high value of H max ; if after some time this value is not confirmed (because 
this node converges to a less polarized state) , which means that all local fields Wi will be smaller than H max , and no 
unit in the network can pretend to be the most polarized one. Then, mechanically the estimation on the maximum will 
decay at a rate 5, until when a variable node presents a polarization field higher than the decaying false information. 
The counterpart is that the damping takes effect spatially on the network. So typically, since the radius of the network 
scales like log (AT), the distribution of the estimated maximum given by the set {H max (i), H max (a)} taken over the 
network, will be enlarged by some factor <51og(A). 

As a result when playing with the parameter 5, two contradictory effects are at work: 

• either the information diffuses rapidly with a poor precision (large 6), 

• cither it takes a long time, before a precise information is diffused (small 5). 

When it comes to the decimation procedure, this can be actually turned into an advantage, since it provides us 
directly with a parameter able to fix, in an heuristic manner, the rate at which the decimation is performed. Each 
variable node is equipped with a counting variable n su (i) initially set to zero and counting successive updates with 
the following two conditions: 

• the variation since last updtatc of the IT's is less than e (convergence criteria, typically e ~ 10 -3 ), 

• i may consider itself to be the most polarized variable (H max (i) = \W^~ — W~\). 

When one of these conditions is not satisfied, n su (i) is set to zero. If n su (i) is greater than a parameter n su (i), the 
variable is allowed to freeze in the direction given by — W[~ . 

C. Experimental results 

We have simulated the behavior of the algorithm using a standard computer, which allows to monitor how much 
progress in computing time would be brought by its fully parallel implementation, and to optimize the choices of 
parameters. 

We use the random 3-SAT formulas as a benchmark problem. Let N denote the number of variables nodes and 
a = M/N the ratio of the number of function nodes M over the number of variable nodes. The range of interest is 
a € [4, 4.25]. Simulations are performed with samples of 10 5 function nodes. Figures ( 2, 3, 4, 5) present some average 
over numerous decimation runs, with various parameters. The runs stop when a paramagnetic state is reached, or 
when the entropy becomes negative. In order to assess the performance of the algorithm, we measure the simulated 
time (the time it would take if the computation was really distributed) to reach a paramagnetic state, which can be 
handled afterwards by walksat in order to find a solution. 

The observables which we consider are 

• N a the number of active (unfrozen) variables nodes. 

• M a the number of active function nodes. 

• <xi the ratio M a ijN a where M a2 is the number of active two-clauses. 

• a3 the ratio M a3 /N a where M a3 is the number of active three-clauses. Since we start from a 3— SAT problem 
we have of course M a = M a2 + M a3 . 

• the entropy of the system, computed from the expression given in [11] 
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Figure 2: Number of active clauses as a function of time for different values of 8 for a 3-SAT problem of size N = 10 5 at a = 4.2 
and n S u = 20. 
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Figure 3: Decimation time up to the paramagnetic state as a function of the size of the system, for a 3-SAT problem at a = 4.2, 
S = 0.005 and n su = 20. 



1. Orders of magnitude for the tunning parameters 

The algorithm depends on two tunable parameters 

• the damping factor 5 on the diffusion of information. We find that 6 e [0.001,0.01] is roughly the range of 
effectiveness of this parameter for N = 10 5 . This actually gives essentially the decimation rate between two 
convergences of SP. For 6 > 0.01, too many variables (~ 10 3 ) are frozen at the same time, and the algorithm 
ceases to find a paramagnetic state. Instead, when d < 0.001, variables are decimated one by one. The time- 
dependence of the decimation process with this parameter is illustrated in fig. 2. 

• n su the number of successive successful updates needed for a variable, to get frozen. It depends on the precision 
e which is required. For e = 10~ 3 it is found that the range of validity of this parameter is between 10 and 
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Figure 4: Rate of succesfull decimation process as a function of N for a 3-SAT problem at a — 4.2, 8 — 0.005 and n su — 20. 
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Figure 5: Number of active clauses as a function of time during the decimation process for different values of a, for a 3-SAT 
problem of size N = 10 J at S = 0.01 and n S u = 10. 



50. Above 50, SP converges before the next decimation cascade. Below 20 the decimation process is essentially 
continuous. Below 10 the algorithm ceases to converge to a paramagnetic state. 

A reasonable choice for the selection of parameters leading to a fast but still successful decimation procedure is 
around 6 = 0.01 and n su = 10. 



2. Dependence with the size of the system 



Once the typical value for the tunable parameter are determined, it is possible to study the behavior of the algorithm 
with the sixe of the problem, N . Fig. 3, shows that the dependence in N is of the order of logiV, which is the best 
performance one could have with such an algorithm. Another expected result is that rate of successful decimation 
should increase when N increases, since finite size effect arc known to alter efficiency of algorithms for solving 3-SAT 
formulas. This is what is effectively observed in fig. 4 
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alpha2 

Figure 6: Trajectories in the phase space (02,03) during the decimation process for various values of a for a 3-SAT problem 
of size N = 10 J at S — 0.01 and n su — 10. (5 samples for each value of a, all decimations processes where succesfull up to 
a — 4.2, and all failed for a — 4.25) The slope of the linear part of the trajectories is ~ —2.4. 



3. Dependence with a 

Depending on a we observe that when we approach the transition point a c ~ 4.26, the final decimation rate 
increases, (see fig. 5) which is consistent with the hypothesis that the size of the backbone of solutions clusters 
increases when we approach the critical value of a. This can be also visualized when looking at the phase space 
trajectory of the decimation process (see fig. 6. During the process 2-clauses are generated, which leads to represent 
the trajectory in the plane (0:2, 03). The trajectory is shorter when we go away from a c . 



DISTRIBUTED SP II: 'REINFORCEMENT ALGORITHM' 



A. SP with external forcing field 



The SP procedure has recently been generalized to allow the retrieval of a large number of different solutions thanks 
to the introduction of probability preconditionings [9, 10]. The original factor graph is modified applying onto each 
variable i a constant external forcing survey in direction £j e { — 1,1} and intensity i]f xt — ir. An a priori probability 
7r of assuming the value ±1 is being thus assigned to the variable i. The components of the forcing modify the usual 
SP equation (5) in the following way : 



it 



i-(i-7r^,±) n (l-vb^j) 

beV ± ( 3 )\a 

•(i-^,=f) n t 1 -^) 

(l-7r% )+ )(l-7r%,_) [] t 1 -^) 

beV(j)\a 



(11) 



These relations, together with (6), form the SP equations with external messages (SP-ext). The quantities 
n^,n~,II° used to define the local biases on each variable i (see Eq. 10) are modified in a similar manner and 
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expressed as a function of the new probabilities rj^- at the fixed point : 



IT 



i-(i-tt%, ± ) n 

bev ± (j) 

(i-^w) n (!-%%■) 

6eV T (j) 

&ev(j) 



(12) 



The external messages drive the value of the surveys r?*^ towards the clusters maximally aligned with the external 
vector field, tt = corresponds to the case in which the external messages are absent ; in the case n = 1, the system 
is fully forced in the direction £. When tt is correctly chosen, this enables to address more specifically (compared to 
the SID algorithm, which permits to retrieve only one solution) clusters close to a given region of the iV-dimcnsional 
space. Thus, a right choice of the a priori probability tt, and the use of alternative random directions £ invariably 
permit to retrieve many solutions in many different clusters. 

The criterion used to decimate towards a solution is still the global criterion of the maximum of the magnetization 
over all variables. So, the decimation based on the SP-ext equations is not yet a distributed solver of the K-SAT 
formulas. 



B. Presentation of the method 



In order to make the decimation procedure fully distributed, we may adopt time-dependent external messages 
which are updated according to a reinforcement strategy. More precisely, good properties of convergence are obtained 
adopting the following scheme : 

• in one time step (sweep), all the r/s are parallely updated using the SP equations with external messages(6, 11), 
which in this scheme are time-dependent, 

• every second time step, one calculates all temporary local fields {wj; + \ w\ , W^} using the equations (7, 8, 
9, 13) — note that these are also functions of the external messages — . The directions £j, Vi G {1, . . . , N} of the 
temporary local messages, which are now time-dependent, are parallely updated to equal sign(W^~ — W~), i.e. 
to align itself with the local bias — W~ . 

As we shall see, with an appropriate choice of the intensity tt of the forcing, most variables are completely polarized 
at the end of a single convergence. A solution of the K-SAT formula is finally found by fixing each variable in the 
direction of its local bias. The reinforcement algorithm (RA) defined above is purely local and hence gives an efficient 
distributed solver for random K-SAT formulas in the hard-SAT phase. 

The algorithm is precisely described in Alg. (7) . Its interpretation and a preliminary study of some of its properties 
are discussed in the subsequent sections. 

To understand how the reinforcement strategy is working, one has to look back at the interpretation of the SP 
equations with an external forcing survey : imposing an external forcing survey of direction £j G { — 1,1} on the 
variable i tries to drive the solution of the cavity biases distributions towards the clusters with a local orientation 
matching the externally imposed direction £j. 

Then, choosing £ parallel to the vector of the temporary local fields is a strategy which typically guarantees that, 
if the intensity tt of the forcing is small enough, the overlap between the forcing vector field £ and the closest solution 
is continuously increasing, in a stronger way than in the case of the usual SP equations. 

When tt is correctly chosen, some satisfying assignments tend to become stable fixed points of the RA dynamics. 
The optimal value of tt is a trade-off between the need for convergence of the SP equations with external forcing 
survey, the strength of the stabilization induced on the satisfying assignments by the update rule and the amplitude 
of the perturbation brought by the external forcing. There is a priori no guarantee that there exists a particular 
value of the intensity tt for which the algorithm, starting from random initial conditions, gets trapped by a single 
cluster, and eventually converges to a single solution inside this cluster, without any further decimation. But, our 
experimental results show that, for N big enough and for almost all values of the parameter a corresponding to the 
SAT phase, there exists an intensity tt for which the overlap between the set of local fields and the closest satisfying 
assignment increases until when it eventually reaches 1 at the end of a unique convergence. 
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Algorithm 7 : reinforcement algorithm 
Input : The factor graph of a Boolean formula in conjunctive normal form; a maximal number of iterations 
tmax , a requested precision e and the probability ty that the external message sends a warning 
Output : UN-CONVERGED if the algorithm has not converged after t max sweeps. If it has converged: one 
assigment which satisfies all clauses 

1: At time t = 0: 

2: for every edge a — > i of the factor graph do 

3: randomly initialize the cavity biases n a ^i(t — 0) £ [0,1]. The additional surveys are initially set to 

Qi(ui,0) 
4: end for 

5: for t — 1 tO t = t max do 

6: update parallelely the cavity bias surveys on all the edges of the graph, generating the values 
?7o-.i(t). using subroutine CBS-UPDATE. 

7: half of the times, update parallelely the reference direction £j 6 { — 1,1} for all variables Xi, using 
subroutine FORC-UPDATE. 

8: if \r] a ->i(t) — T) a ->i(t — 1)| < e on all the edges then 

9: the iteration has converged and generated ?7o_>j = T]a—>i(i) 

10: goto line 13 
11: end if 
12: end for 
13: if t = t m ax then 
14: return UN-CONVERGED 

15: else {If t < tmax} 

16: return the satisfying assignment which is obtained by fixing the boolean variable x\ parallel to its 
local field : Xi — sign{W^ — W\ }, and by performing the simplifications over the neighboring 
variables 

17: end if 



Algorithm 8 : subroutine CBS-UPDATE (r^^) 

Input : Set of all cavity bias surveys arriving onto each variable node j G V(a) \ i 
Output : New value for the cavity bias survey r/ a ^i 

1: for every j 6 V(a) \i do 

2: compute the values of TU^_> a , nj_^ a , n°_, a using eq. (11). 

3: Compute r\ a ~^i using eq. (6) . 

4: end for 



C. Results 

1. Determination of the optimal intensity of the forcing n 

The efficiency of the method is tested on random 3-SAT formulas. One first determines the optimal intensity 7r op t, 
which is the intensity of the forcing which permits to find a satisfying assignment in the minimum number of steps. 
When the density of ground states clusters increases (for lower values of a), ir opt should be set at a higher value to be 
able to bias the convergence towards a unique cluster ; however, a too high value of 7r may let the algorithm converge 
too rapidly in different parts of the graph towards assignments belonging to different clusters. The algorithm would 
then loop without convergence among these clusters. For these two reasons, n op t should be positively correlated with 
the complexity S = log ^ al ' , 



Algorithm 9 : subroutine FORC-UPDATE(^) 

Input : Set of all cavity bias surveys arriving onto the variable node i, including the additional survey 

Qi{e u ;-n) 

Output : New value for the direction £i of the additional survey 

1: Compute the local fields W.^ +) , W^'\ W- 0> using eq. (7, 8, 9, 13) 
2: Compute & = sjgn(W i (+) - W"/" 5 ) 
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Figure 7: Scaling of the optimized it against the complexity E. The optimized 7T, which is the intensity of the forcing permitting 
to find a satisfying assignment in the minimum number of steps, is recalculated for each formula : it depends in average linearly 
on the complexity : it — 11.1 x E (N = 10 5 , K = 3, 4.0 < a < 4.24). Each point corresponds to a different formula ; note that 
the relationship 7r opt = /(E) becomes slightly sublinear for E < 0.003. The subplot shows the evolution of the energy of the 
dump solution (in which each variable is fixed parallel to the temporary local field) that would be obtained if the convergence 
was stopped at time t (a = 4.24). 



One observes that the reinforcement process diverges if the parallel update of the forcings is performed at a too 
high frequency with respect to the frequency of the parallel updates of the 77s ; the local loops in which the algorithm 
gets trapped are likely due to simultaneous convergence towards different satisfying assignments in different parts of 
the graph. A good strategy, systematically used for the experiments of the present paper, is to perform the parallel 
updates of the forcings once every second parallel update of the 77s. 

In the hard-SAT phase, one determines Tr opt as a function of the calculated complexity [17]. As predicted, the 
experiments show that, for < a < 4.245, -K pt is positively correlated, and in average scales linearly, with £ 
{n pt = 11.1 x £, c.f. Fig. 7, each point representing a different formula), and that the probability of converging 
towards a satisfying assignment using this value tends to 1 (N = 10 5 , e = 10 -3 , t max = 1000, c.f. Fig. 8). 



2. Range of validity of the method 

In this section we compare RA with the so-called serialized survey propagation-inspired decimation (SID). 

The figure 8 plots the percentage of successful runs (i.e. of convergence towards a satisfying assignment after a single 
run) of both the synchronous and the asynchronous RA as a function of a. One calls synchronous RA the algorithm 
previously described (Alg. 7) in which the update of all 77s, as well as respectively the update of all forcings, are 
performed simultaneously. Conversely, in the asynchronous RA, the updates are performed sequentially in a random 
permutation order (note, that, in the latter case, forcings can be updated as frequently as the 77s). For a — 4.24, both 
algorithms invariably permit to retrieve a solution. This is in accordance with the result presented in Ref. [11], in 
which the classical SID has been shown to always find a solution by fixing 0.125% of variables after each convergence 
of the SP equations. 

The cut-off value of the complexity S, for which the decimation converges in half of the cases, is S1/2 = 0.0013 
(subplot of Fig. 8). £1/2 was constant for all sizes of the graph TV > 2.10 4 . When the complexity decreases, the 
decimation progressively fails. Depending on the value of 7r, either it converges towards a set of local fields, which 
corresponds to a weighted average of the contributions of numerous clusters of solutions ; or, it loops among low 
energy configurations (usually, for N = 10 5 , a = 4.25, one finally finds assignments of energy E = — 50). 

Moreover, asymptotically, the numerical critical a above which the SID, without backtracking, should not permit 
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Figure 8: Percentage of successful runs (convergence leading to a satisfying assignment) as a function of ct. The subplot shows 
the same percentage as a function of the calculated complexity E. The cut-off complexity, for which the RA is successful in half 
of the cases tested, is £1/2 = 0.0013 (N = 10 5 ,e = W~ 3 ,tm ax — 1500). The range of validity of the reinforcement algorithm 
thus matches the one of the SID (see text). 



to find a satisfying assignment has been estimated to be a a = 4.252 [19], substantially smaller than the theoretical 
critical value ac = 4.2667 (for the improvement brought by the backtracking procedure, see [11, 20]). 

One may study the behavior of the RA at a a for N = 10 6 . The calculated complexity of the formulas with N = 10 6 
and a = 4.252 equals X = 0.00133 ± 0.00013, in agreement with the cut-off complexity determined for formulas of size 
A" = 10 5 (Fig. 8). For A" = 10 6 and a = 4.252, one then sets TT opt = 10.5 x £ (in agreement with the curve of Fig. 7 
for A" = 10 5 ). The "reinforcement algorithm" using these parameters nicely permits to converge towards a solution 
in 10 out of the 15 formulas tested. 

One concludes that the present algorithm is valid at least in the same range of parameters as the classical SID [11]. 



3. Which solutions are addressed? 



The satisfying assignments found by the present algorithm arc highly dependent on the initial conditions, i.e. on 
the initial random values imposed on the 77s. For different random initial conditions, one converges towards solutions 
belonging to different clusters. 

To know where the found solutions are located, one chooses an instance from a random graph ensemble in which 
the number of clauses where any given variable appears negated or directed are kept strictly equal (bar-balanced 
formulas). For such an instance (N = 30000, a = 3.33), the clusters of solutions are expected (and have been 
observed) to be uniformly distributed over the whole N-dimcnsional space [10]. The RA is launched in such an 
instance 800 times starting from different random conditions. The histogram of the Hamming distances <i(Si, S2) = 
|(1 — jt J2i x i x i. S ^) between any two of the 800 solutions peaks at 0.4995 (with a standard deviation of 0.0118). 

There is no guarantee that the present algorithm is able to address all clusters of solutions, but the fact that 
the distance distribution among the found solutions of a bar-balanced formula peaks at 0.5 with a standard devi- 
ation expected from a random distribution (i.e. ^= = 0.0115), suggests that the addressable solutions could be 
homogeneously distributed over the space of the clusters of solutions. 
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Figure 9: The number of parallel sweeps needed to find a solution depends logarithmically on N (a — 4.22, n = 0.04). The 
examples in which the convergence fails (due to finite size effects, and which have not been observed for N > 30000) are not 
represented. 



4- Logarithmic dependence of time on the size of the graph 

One now analyses the evolution of the convergence time as a function of the size of the graph. For this, one 
determines, for a fixed value of a = 4.22, the simulated time that the convergence would take if implemented on a 
distributed device, as a function of the size N of the graph (Fig. 9). For N ranging from 9000 to 9.10 5 , the time 
of convergence is well approximated by a logarithmic fit [t = —315 + 67 * log(N)\. The unit of time is one parallel 
update of all 77s. Each point corresponds to a different random formula. Note that, above N > 3.10 4 , all trials were 
successful, and the RA invariably leads to a satisfying assignment. For smaller graphs, due to finite-size effects, the 
algorithm either fails to converge or converges to an assignment with a few contradictions (generally 1 to 4) in few 
cases: at most 5% of the experiments for N = 9000 and none for N > 30000. 

As local messages need a logarithmic time to reach all parts of the graph, and as, in the hard-SAT phase, the 
assignment of a value to a given variable i does not depend on purely local constraints but on the state of the whole 
graph, one expects that, by the use of purely local messages on a distributed device, the problem can not be solved 
faster than in a logarithmic time. 



5. Performance of the reinforcement algorithm on a serialized implementation 



Numerical experiments show that RA scales as N x log(iV) when implemented on a serialized computer. In the 
scheme of the classical SID, a slightly slower scaling, in N x (log(iV)) 2 , can be achieved by fixing after each convergence 
a given percentage of the total amount of variable [11]. 

Moreover, Ref. [11] reports that, for a = 4.24 and N = 10 5 , by fixing at least / = 0.125% of the N variables at the 
end of each convergence, the SID reaches a paramagnetic state after an average of 7460 sweeps, i.e. 7460 updates of 
all remaining 77s. Conversely, the synchronous RA only needs 600 parallel updates of the 77s plus 300 parallel updates 
of the forcings to converge towards a solution. Its asynchronous version requires 450 — 500 complete updates of the 
rjs and of the forcings to arrive at a solution. This means that, in a serialized computer, close to the critical point, 
the present algorithm performs faster than the classical SID by a factor of 5. This can be checked by running the two 
algorithms on a 2.4 GHz PC : the formula (N = 10 5 and a = 4.24) is solved in average in 30 minutes using the SID 
with / = 0.125% and in 7 minutes using the "reinforcement algorithm" (either asynchronous or synchronous). 

We finally characterize the slowing down of the RA when approaching the critical value ac = 4.2667 (Fig. 10). 
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Figure 10: Time of convergence as a function of E (left) or a (right), using for 7r the optimal intensity of the forcing (N = 10 ). 
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Close to the critical point, the time of convergence is shown to be inversely proportional either to E or to 1 — The fit of 
-P over E £ [0, 0.008] gives = 1.474 and j3 = 0.9994. The fit t = tg X (1 - ^)~ 
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the datas t 
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1.057. 



Far away from the critical point, the time of convergence is highly reliable. Close to the critical point, the algorithm 
gets slower and the time of convergence becomes more variable. Setting the intensity of the forcing at its optimal 

value, one finds that, for a < a a, the time of convergence is inversely proportional to the complexity £ : t = -g- 
(t$ = 1.474, Fig. 10, left panel) ; moreover, as expected from the pseudo-linear relationship between S and a, the 
time of convergence also scales with a 2 a : t = a ^ a (fg = 3.02, Fig. 10, right panel). 

Very close to ac (& > o-a), the RA is unable to find ground states and only explores quasi-optimal assignments. 



VI. CONCLUSION 



This study aimed at making fully distributed the Survey Propagation decimation algorithm. 

The first distributed solver, the "SP diffusion algorithm" diffuses as dynamical information the maximum bias 
over the system, so that variable-nodes can decide to freeze in a self-organized way. Its properties (solving rate, 
final number of active clauses when it reaches the paramagnetic state) are comparable with the previously described 
serialized "SP Inspired Decimation" . The new feature is that the simulated time of convergence towards a solution, 
if this was implemented on a fully distributed device, goes as log(TV), i.e. scales optimally with the size of the graph. 

The second solver, the "SP reinforcement algorithm" , makes use of time-dependent local external forcings, which let 
the variables get completely polarized in the direction of a solution at the end of a single convergence. The estimated 
time of convergence of this solver, when implemented on a distributed device, also goes as log(iV). 

The present algorithm is the fastest existing solver of the K-SAT formulas, when implemented either on a distributed 
device [22] or on a serialized computer. 

Moreover, the strategies proposed in the present manuscript are not specific to the K-SAT problem, but are likely 
to apply to many types of NP-complete constraint satisfaction problems (CSP) with direct applications (e.g. data 
compression algorithm [1]). 

The different codes are available at ([23], [24]). 
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